1.  INTRODUCTION


Computer simulation is often used to study real-world systems
that are too complex to be modeled and analyzed entirely by
mathematical methods. Unfortunately, simulation models used to study
large, complex systems tend to be extremely large and complex
themselves, and the corresponding computer codes (programs) that execute
these models are generally very long running. As a consequence, users of
large-scale simulations are often overwhelmed by the vast number of
factors (i.e., input variables) contained in the model and are
unsure how to analyze the system model effectively
without performing an excessive number of costly and time-consuming
simulation runs. Methods of reducing this cost and time
are essential if any fruitful simulation experimentation
is to take place. In such situations, the use of factor screening
methods can substantially reduce the total number of computer runs
required to study the system model.

Factor screening methods (see, for example, [1], [7], and [9])
are statistical methods that attempt to identify, economically and
efficiently, a set of "most important" factors. Once the more important
factors have been identified, subsequent simulation experimentation can
be focused on these critical factors, thereby eliminating experimentation
with relatively negligible factors that would needlessly consume resources.
Although factor screening methods are applicable to experimentation in
general, computer simulation offers an especially fertile area of
application for these techniques for at least two reasons: (a) the large
number of factors generally built into complex simulation models, and
(b) the scarcity of computer runs that often handicaps planned
simulation experimentation.


Under an Office of Naval Research contract, Desmatics, Inc. has
conducted extensive research in this area. As part of its research
effort, Desmatics has selected two primary screening strategies for
intensive study. These two strategies are Random Balance (RB) and
Two-Stage Group Screening (GS). In earlier technical reports the
respective performance characteristics of these two strategies were
evaluated. The present report is the first of two technical reports to
compare directly the performance of the RB and GS strategies.

2.  MODEL  ASSUMPTIONS 


Suppose that K factors, each at two levels (±1), are to be
screened for their effect on the simulation response (i.e., output
variable). For detecting the factors having major effects it is
generally reasonable to assume the first-order model

     y_i = \sum_{j=1}^{K} \beta_j x_{ij} + \varepsilon_i,          (2.1)

where y_i is the value of the response in the i-th simulation run, x_ij
is the level (±1) of the j-th factor in the i-th simulation run, β_j is
the (linear) effect of the j-th factor, and the ε_i are i.i.d. N(0, σ²)
random disturbances, σ² > 0 unknown. Ordinarily we would use model
(2.1) over a relatively small region of the factor space.


In this report we make the following additional simplifying
assumptions:

(i)  k ≥ 1 (k unknown) of the K factors are active (i.e., have
     a nonzero effect) and the remaining (K-k) factors are
     inactive, and

(ii) all active factors have the same absolute effect, Δ > 0,
     that is,

     |\beta_j| = \begin{cases} \Delta, & \text{if the j-th factor is active} \\ 0, & \text{if the j-th factor is inactive.} \end{cases}

We let β(i) for i = 0, 1, ..., k denote the case in which (of the
k active effects) i active effects are equal to -Δ and the remaining
(k-i) active effects are equal to +Δ. The (K-k) inactive effects are,
by definition, equal to zero. We note that the β(0) case, or β(k) case,
corresponds to the situation in which all k active effects are in the
same direction. In practice, of course, the direction will be known for
some suspected effects and unknown for others. Lastly, we define the
signal-to-noise ratio as the ratio of Δ to the error standard deviation
σ, that is, Δ/σ.
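The assumptions above can be made concrete with a short simulation
sketch. The Python below is illustrative only and is not part of the
original study: the function names, the seed, and the placeholder ±1
design are all assumptions, and the response is generated exactly as
written in model (2.1) with K = 60, k = 4, and two negative effects.

```python
import random

def make_effects(K, k, i, delta, rng):
    """Effects for case beta(i): i effects at -delta, k - i at +delta, K - k at zero."""
    effects = [-delta] * i + [delta] * (k - i) + [0.0] * (K - k)
    rng.shuffle(effects)  # active factors land in random positions
    return effects

def simulate_responses(design, effects, sigma, rng):
    """Return y_i = sum_j beta_j * x_ij + e_i for each run (row) of the design."""
    return [
        sum(b * x for b, x in zip(effects, row)) + rng.gauss(0.0, sigma)
        for row in design
    ]

rng = random.Random(0)  # fixed seed, chosen arbitrarily
effects = make_effects(K=60, k=4, i=2, delta=2.0, rng=rng)
# Placeholder design with arbitrary +/-1 levels (not an RB or PB design).
design = [[rng.choice((-1, 1)) for _ in range(60)] for _ in range(12)]
y = simulate_responses(design, effects, sigma=1.0, rng=rng)
```

With sigma = 1.0 and delta = 2.0 this corresponds to a signal-to-noise
ratio of 2, one of the two values examined in Section 4.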


3.  THE  RB  AND  GS  STRATEGIES 

In this section we review briefly the RB and GS strategies. These
strategies are discussed more fully in [2] and [6]. In addition, we
define three basic measures of performance which we will use in Section
4 to compare screening performance.
3.1.  Random  Balance


In a two-level (±1) RB design for studying K factors, each column
of the design matrix consists of N/2 +1's and N/2 -1's, where N (an even
number) denotes the total number of runs to be made. In each design
column the +1's and -1's are allocated randomly, making all possible
combinations of N/2 +1's and N/2 -1's (there are N!/[(N/2)!]² of them)
equally likely, with each column receiving an independent randomization.
To analyze RB designs we apply a standard F-test separately to each
factor, ignoring all other factors. Furthermore, we conduct each F-test
at the same level of significance, say α_r. Our RB strategy, therefore,
is completely specified by N and α_r. Accordingly, we denote such a
screening strategy by RB(N, α_r).
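As a rough illustration of the construction and analysis just described,
the following Python sketch builds an RB design and computes a one-factor
F statistic. It is a sketch under stated assumptions rather than the
procedure used in [6]: the function names are hypothetical, and the
"standard F-test" is taken here to be the two-group one-way analysis of
variance with 1 and N-2 degrees of freedom.

```python
import random

def rb_design(N, K, rng):
    """N x K random balance design: each column has N/2 +1's and N/2 -1's in random order."""
    if N % 2:
        raise ValueError("N must be even")
    columns = []
    for _ in range(K):
        col = [1] * (N // 2) + [-1] * (N // 2)
        rng.shuffle(col)  # independent randomization for each column
        columns.append(col)
    return [list(row) for row in zip(*columns)]  # transpose to runs x factors

def f_statistic(column, y):
    """F statistic (1 and N-2 d.f.) comparing responses at +1 and at -1 for one factor."""
    hi = [v for x, v in zip(column, y) if x == 1]
    lo = [v for x, v in zip(column, y) if x == -1]
    n = len(hi)  # equals len(lo) because the column is balanced
    mean_hi, mean_lo = sum(hi) / n, sum(lo) / n
    ss_between = n * (mean_hi - mean_lo) ** 2 / 2
    ss_within = sum((v - mean_hi) ** 2 for v in hi) + sum((v - mean_lo) ** 2 for v in lo)
    return ss_between / (ss_within / (2 * n - 2))  # assumes ss_within > 0

design = rb_design(N=12, K=60, rng=random.Random(1))
```

A factor would be declared active when its statistic exceeds the upper
α_r critical value of the F distribution with 1 and N-2 degrees of
freedom; that lookup is omitted here to keep the sketch free of
third-party dependencies.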

3.2.  Two-Stage  Group  Screening 

In this strategy we partition the K factors randomly into G groups
of size g; if K is not a multiple of g, we assume that the group sizes
are taken as "evenly" as possible. Then, by assigning the same level
(+1 or -1) to all component factors within each group, we test the group
factors as if they were single factors. All factors in groups found to
have a significant effect are subsequently studied in a second-stage
experiment.
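The first-stage grouping step can be sketched as follows. This is an
illustrative Python fragment, not the authors' implementation: the
function names are hypothetical, and "as evenly as possible" is
interpreted here (an assumption) as using the smallest number of groups
G with no group larger than g, with group sizes differing by at most one.

```python
import random

def random_groups(K, g, rng):
    """Randomly partition factor indices 0..K-1 into ceil(K/g) groups of near-equal size."""
    G = -(-K // g)  # ceiling division gives the number of groups
    order = list(range(K))
    rng.shuffle(order)
    # Dealing the shuffled indices round-robin keeps sizes within one of each other.
    return [order[j::G] for j in range(G)]

def expand_group_design(group_design, groups, K):
    """Give every factor the level assigned to its group in each first-stage run."""
    rows = []
    for group_row in group_design:
        row = [0] * K
        for level, members in zip(group_row, groups):
            for j in members:
                row[j] = level
        rows.append(row)
    return rows

groups = random_groups(K=60, g=7, rng=random.Random(0))
```

Each row of `group_design` holds one ±1 level per group, so the expanded
row assigns that same level to every factor in the group, which is how
the group is tested as if it were a single factor.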

In the first- and second-stage experiments we use the multifactorial
designs of Plackett and Burman [8]. These designs are specially
constructed two-level orthogonal designs for studying up to (4m-1) factors
in 4m runs. The number of runs required by the smallest
Plackett-Burman (PB) design to study s factors is given mathematically by


B(s)  =  s+4  -  s(mod  4). 

PB designs can be analyzed by the usual analysis of variance
procedures for factorial experiments. Note, however, that when the number
of factors is one less than a multiple of four, no degrees of freedom
are left to estimate experimental error (σ²). It is advisable, therefore,
for the study of s factors (or group-factors) to employ the PB design in
B(s+1) runs. This would result in a minimum of one and a maximum of
four error degrees of freedom.
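The run-count formula and the claim about error degrees of freedom are
easy to check numerically. The short Python sketch below uses
hypothetical function names and assumes the usual accounting in which
one degree of freedom goes to the mean and one to each of the s factors.

```python
def pb_runs(s):
    """Runs in the smallest Plackett-Burman design for s factors: B(s) = s + 4 - s (mod 4)."""
    return s + 4 - s % 4

def error_df(s):
    """Error degrees of freedom when s factors are studied in B(s+1) runs."""
    return pb_runs(s + 1) - 1 - s  # total runs, less the mean, less s effects

# Cycle through the four residues of s modulo 4.
for s in (59, 60, 61, 62):
    print(s, pb_runs(s), pb_runs(s + 1), error_df(s))
```

For s = 59 (one less than a multiple of four) the design in B(s) = 60
runs leaves no error degrees of freedom, whereas the design in
B(s+1) = 64 runs leaves four.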

If we let α_1 and α_2 denote the levels of the significance tests
performed at the end of the first and second stages, respectively, our
GS strategy is completely specified by g, α_1, and α_2. We denote such a
strategy by GS(g, α_1, α_2).

3.3.  Performance  Measures 

With regard to model (2.1) and the additional simplifying
assumptions we have made, we can define three basic measures of
screening effectiveness. These are:

     Power. We denote by A the number of active factors that
     are detected correctly, and we define E_A = 100E(A)/k as
     a percentage measure of the power (or sensitivity) to
     detect the active factors.

     Type I Error. We denote by U the number of inactive
     factors that are declared active, and we define E_U =
     100E(U)/(K-k) as a percentage measure of Type I error.

     Relative Testing Cost. We denote by R the total
     number of runs required by an RB or GS strategy. We
     define E_R = 100E(R)/B(K+1) as a percentage measure of
     expected relative testing cost, where B(K+1) denotes the
     number of runs required by a PB design for (K+1) factors.

In references [2], [3], and [6], formulas are given with which
to calculate E_A, E_U, and E_R for any GS(g, α_1, α_2) or RB(N, α_r)
strategy. We apply these results, as needed, in the following section
in making our comparative performance study.


4.  RESULTS  OF  COMPARISON  STUDY 


In our investigation we examined the following eight combinations
of (K, k, Δ/σ):

     K = 60 and 240,
     k = (1/15)K and (4/15)K,
     and Δ/σ = 2 and 4.

Further, for each of these eight cases we considered the following eight
combinations of Type I error and expected number of runs:

     E_U = 10% and 20%

     and E(R) = 12, 26, 38, 52 runs       for K=60,
                46, 100, 144, 198 runs    for K=240.


We chose these particular run numbers to correspond as closely as
possible to expected relative testing costs (E_R) of 20%, 40%, 60%, and
80%. We could not specify run numbers that gave exact correspondence
between E_R and these levels. The reason for this is that in an
RB(N, α_r) strategy the total number of runs required, R, is constant
and is precisely N, which must of course be kept even. Consequently,
E(R) had to be restricted to even numbers of runs. A quick calculation
will show that for K=60 factors, we consider E_R = 18.75%, 40.63%,
59.38%, and 81.25%; for K=240 factors, we consider E_R = 18.85%, 40.98%,
59.02%, and 81.15%.
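The "quick calculation" can be reproduced from the definitions of E_R
and B(s) given in Section 3. The Python sketch below is illustrative;
the function and variable names are hypothetical, and only the run
numbers and formulas come from the report.

```python
def pb_runs(s):
    """B(s) = s + 4 - s (mod 4), the PB run count from Section 3.2."""
    return s + 4 - s % 4

def relative_cost(runs, K):
    """E_R as a percentage: 100 * E(R) / B(K+1)."""
    return 100.0 * runs / pb_runs(K + 1)

run_numbers = {60: (12, 26, 38, 52), 240: (46, 100, 144, 198)}
costs = {K: [relative_cost(r, K) for r in runs] for K, runs in run_numbers.items()}
```

Here B(61) = 64 and B(241) = 244, so that, for example, 38 runs with
K=60 gives 100(38)/64 = 59.375%, which the report rounds to 59.38%.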

In all, then, we considered 64 treatment combinations of K, k, Δ/σ,
E_U, and E_R. For each treatment combination, we determined the
RB(N, α_r) and GS(g, α_1, α_2) strategies that maximize power E_A, and
thus are "optimal" in this sense. For GS strategies we did this
separately for the β(0) and β([k/2]) cases. The β(0) case represents the
"best" case situation for group screening since no cancellation of
active effects can occur within groups. The β([k/2]) case, on the other
hand, represents the "worst" case situation for group screening since in
this case the chance of group-factor cancellation of active effects is
greatest (among all β(i) cases). In contrast, the sensitivity of an
RB(N, α_r) strategy is the same for all β(i) cases, i = 0, 1, ..., k.

The optimal RB(N, α_r) strategy for a given (K, k, Δ/σ) condition can
be quite readily determined. Mauro and Smith [6] have shown that the
Type I error of an RB(N, α_r) strategy is very closely approximated by
α_r. Furthermore, the power of an RB(N, α_r) strategy increases as either
N or α_r increases. It follows that for a given (K, k, Δ/σ, E_U, E_R)
condition the optimal RB(N, α_r) strategy is simply the RB(E(R), E_U)
strategy.

The optimal GS(g, α_1, α_2) strategy is much more difficult to
determine. Mauro [2] and Mauro and Burns [4], however, have developed
a computer-aided search routine that can be used to determine the
optimal GS(g, α_1, α_2) strategy under the same model assumptions we have
made in this paper. Moreover, the algorithm treats both the β(0) and
β([k/2]) cases. Accordingly, this search routine was applied to
determine the optimal GS(g, α_1, α_2) strategy in each of the 64
experimental conditions.

The corresponding powers of the optimal RB(N, α_r) and GS(g, α_1, α_2)
strategies are presented for easy comparison in Tables 1 and 2. Table
1 summarizes the results for K=60 factors and Table 2 does so for K=240
factors. The values of g, α_1, and α_2 that define the optimal
GS(g, α_1, α_2) strategies are not given in these tables but are listed
in the appendix. For notational purposes and convenience of
presentation, we define GS.i for i = 0, 1, ..., k to be the optimal
GS(g, α_1, α_2) strategy in the β(i) case. Although we only consider i=0
and i=[k/2], it is clear because of symmetry considerations that the
GS.i strategy is equivalent to and has the same power as the GS.(k-i)
strategy.

5.  DISCUSSION 

As noted previously, β(0) and β([k/2]) are the "best" and "worst"
case situations, respectively, for group screening. Consequently, the
power corresponding to the GS.0 strategy is always greater than that
corresponding to the GS.[k/2] strategy. As seen from Tables 1 and 2,
this difference in power becomes greater as expected relative testing
cost increases.

Table 1.  Power Comparisons of Optimal RB and GS Strategies for K=60 Factors.

Table 2.  Power Comparisons of Optimal RB and GS Strategies for K=240 Factors.

Further inspection of Tables 1 and 2 reveals that Δ/σ, over the
range we considered, had little effect on the powers of the optimal RB
strategies and virtually no effect on the powers of the optimal GS
strategies. Effectively, therefore, we can ignore Tables 1c, 1d, 2c, and
2d and restrict attention to Tables 1a, 1b, 2a, and 2b (or vice versa).
We suspect, though, that had we considered a signal-to-noise ratio
somewhat smaller than two (say Δ/σ = 1), there would have been some loss
in power compared with Δ/σ = 2 and 4. However, signal-to-noise ratios
less than two are probably not of practical interest in screening
situations.

From a comparison of Table 1a with 2a and Table 1b with 2b, it is
readily seen that the powers of the optimal RB strategies depend on K
and k basically only through p = k/K, the proportion of active factors
to the total number of factors. This simple relationship apparently does
not hold for GS strategies.

In this study p ranges from 6.7% (=1/15) to 26.7% (=4/15). We see
from the tables that the optimal RB and GS strategies have greater power
when p = 6.7%. This observation is in accordance with the notion that
factor screening is more effective for a given K in the presence of fewer
active factors (i.e., for smaller p). It can also be seen from the tables
that the drop in power as p increases from 6.7% to 26.7% is more extreme
for the optimal RB strategies than for the optimal GS strategies.

Continuing, it is clear that both Type I error and relative testing
cost have a strong influence on the powers of the optimal RB strategies.
For the optimal GS strategies, relative testing cost similarly has a
major effect on power. Type I error, however, has little effect on the
powers corresponding to optimal GS strategies. This result was
somewhat unexpected, and we investigated this phenomenon further for a
few selected conditions. Surprisingly, for these cases we found
relatively little loss in power for the optimal GS strategies with
Type I error rates as low as 1%. On the other hand, small Type I error
rates generally have a debilitating effect on the use of RB strategies.

For the remainder of this section we discuss the
relative merits of each screening strategy. This discussion should
provide some guidance and insight into the use and selection of these two
techniques for factor screening.

A primary question of interest is for what combinations of Type I
error and expected relative testing cost should one consider the use of
an RB strategy rather than a GS strategy. Over the range of Type I error
considered, the optimal RB strategy is "better" (i.e., has greater power)
for low expected relative testing cost, and the optimal GS strategy is
better for high expected relative testing cost. In the β(0) case, the
crossover for E_U = 10% is about E_R = 45%, and for E_U = 20% the
crossover is about E_R = 60%. The crossover, however, varies widely with
K and k. In general, the crossover decreases (increases) as either K or
k decreases (increases). In the β([k/2]) case the crossover shifts upward
about 15%.

Preliminary extrapolation studies have indicated that the
crossover, where the optimal GS strategy becomes better than the optimal
RB strategy, is smaller (larger) for smaller (larger) levels of Type I
error. This would therefore suggest that the optimal GS strategy has
an advantage over the optimal RB strategy at low Type I error rates
but begins to lose this advantage as one considers screening at higher
Type I error rates. Of course, in a particular screening application,
it is up to the analyst to make the appropriate compromises between
Type I error, relative testing cost, and power.

There are two very important practical considerations that should
be noted. The first of these is that the total number of screening runs
required by an RB(N, α_r) strategy is fixed prior to experimentation. In
a GS(g, α_1, α_2) strategy, the total number of runs required is random.
The RB strategy, therefore, offers greater control over the number of
screening runs that will be expended.

The second and perhaps most important consideration is that for a
given expected relative testing cost, determination of the optimal
GS(g, α_1, α_2) strategy requires prior knowledge of k, Δ/σ, and the
number of active effects in the positive direction. On the other hand,
determination of the optimal RB(N, α_r) strategy does not require this,
or any other, prior knowledge. Consequently, any advantages of the
optimal GS strategies (as indicated in the tables and discussed so far)
may be offset by losses in power due to imprecise prior knowledge. We
examine this potential hazard in more detail in the following section.

6.  PRACTICAL  CONSIDERATIONS 

A desirable feature of any factor screening procedure is the ability
to control Type I error and expected relative testing cost. As indicated
previously, this is always possible with an RB(N, α_r) strategy. With
a GS(g, α_1, α_2) strategy, however, this control is possible only with
prior knowledge of k, Δ/σ, and β(i). An important question, therefore,
is to what extent imprecision in this prior knowledge affects the
performance of a GS(g, α_1, α_2) strategy. In this section we attempt
to answer this question through the use of two case studies. These
examples will serve to illustrate the practical difficulties associated
with the application of the GS strategy.

In the first case study, we assume that there are K=60 factors to
be screened for their effect on the response. In addition, suppose
that we wish to control our Type I error at 10% and expected relative
testing cost at 59.4% (equivalently, E(R) = 38 runs). Further suppose
that our prior knowledge tells us to expect that k=16 factors are active,
Δ/σ = 2, and all active effects are in the same direction. We see from
the appendix that the optimal GS strategy for this situation is the
GS(7, 0.00325, 0.29673) strategy. Suppose for the moment, however,
that our prior knowledge is not entirely accurate. In Table 3 we give
the performance of the GS(7, 0.00325, 0.29673) strategy for all
combinations of k, Δ/σ, and β(i) for k = 8, 12, 16, Δ/σ = 2, 4, and
β(i) = β(0), β([k/2]).

In the second case study we assume that K=240 factors are to be
screened. Once again suppose that we wish to control Type I error at
10% and expected relative testing cost at 59.0% (equivalently, E(R) =
144 runs), and suppose our prior knowledge tells us to expect that
k=16, Δ/σ=4, and β(i)=β(0). From the appendix, the optimal GS strategy
for this situation is the GS(3, 0.05907, 0.55223) strategy.

  K     k    Δ/σ    β(i)     E_U     E_R     E_A

 60     8     2     β(0)     4.4    38.9    35.6
 60     8     2     β(4)     2.9    32.7    18.0
 60     8     4     β(0)    10.4    59.4    71.5
 60     8     4     β(4)     7.8    49.3    44.6
 60    12     2     β(0)     7.1    49.0    46.6
 60    12     2     β(6)     4.0    36.6    20.8
 60    12     4     β(0)    14.9    74.6    80.8
 60    12     4     β(6)    10.1    56.7    47.1
 60    16     2     β(0)    10.0    59.4    56.6
 60    16     2     β(8)     5.0    40.1    23.4
 60    16     4     β(0)    18.7    87.0    87.5
 60    16     4     β(8)    11.9    62.2    49.6

Table 3.  Performance Results for GS(7, 0.00325, 0.29673)
          Strategy.  All Results Are Expressed as Percentages.

In Table 4 we give the performance of this particular GS strategy
for all combinations of k, Δ/σ, and β(i) for k = 16, 24, 32, Δ/σ = 2, 4,
and β(i) = β(0), β([k/2]).

As  can  be  seen  from  Tables  3  and  4,  Type  I  error  and  expected 
relative  testing  cost  can  deviate  greatly  from  their  intended  values, 
although  they  always  move  in  the  same  direction.  From  a  practical 
standpoint,  these  results  indicate  that  the  use  of  imprecise  prior 
knowledge  can  have  rather  undesirable  consequences.  Underestimating 
the number of active factors results in greater Type I error and greater
expected  relative  testing  cost  than  desired.  Overestimating  the  number 
of  active  factors  has  the  reverse  effect.  Certainly,  this  is  a  major 
practical  drawback  to  group  screening  as  a  technique  for  factor  screening. 

7.  CONCLUSIONS  AND  SUMMARY 

In this paper we attempt to compare the efficacy and relative
merits of a two-stage group screening (GS) strategy versus a random
balance (RB) screening strategy. We assume a screening model in which
the active (i.e., nonzero) effects are additive and have the same
absolute magnitude. Accordingly, this model is most appropriate when it
is expected that a relatively small number of factors (i.e., inputs)
have a major effect on the response (i.e., output) and the remaining
factors have a negligible effect. In such situations, the objectives
of a screening strategy are to detect as many of the "important"
factors as possible, to declare important as few "unimportant" factors
as possible, and to perform as few computer runs as possible.

  K     k    Δ/σ    β(i)      E_U     E_R     E_A

240    16     2     β(0)     10.0    59.0   100.0
240    16     2     β(8)      9.9    58.5    94.1
240    16     4     β(0)     10.0    59.0   100.0
240    16     4     β(8)      9.9    58.5    94.1
240    24     2     β(0)     13.2    66.8   100.0
240    24     2     β(12)    12.9    65.6    91.4
240    24     4     β(0)     13.2    66.8   100.0
240    24     4     β(12)    12.9    65.6    91.4
240    32     2     β(0)     16.3    74.1   100.0
240    32     2     β(16)    15.8    71.9    89.0
240    32     4     β(0)     16.3    74.1   100.0
240    32     4     β(16)    15.8    71.9    89.0

Table 4.  Performance Results for GS(3, 0.05907, 0.55223)
          Strategy.  All Results Are Expressed as Percentages.

In Sections 4 and 5 we compare the "optimal" RB strategy with
the "optimal" GS strategy for a number of experimental conditions.
We found that the optimal GS strategy is generally better than the
optimal RB strategy at low Type I error rates but begins to lose its
advantage as one considers screening at higher Type I error rates.
For example, at a controlled Type I error rate of 10%, the optimal
RB strategy is better than the optimal GS strategy when expected
relative testing cost is less than (approximately) 45%, at least over
the conditions we examined.

Determination of the optimal GS strategy, however, requires prior
knowledge of the number of active factors, the signal-to-noise ratio
of the active effects, and the directions of the active effects. The
RB screening technique requires no such prior knowledge. We discuss
the effects of imprecise prior knowledge on the group screening method
in Section 6. The analysis of that section indicates that inaccurate
prior knowledge can have undesirable consequences on screening
performance in that one cannot control the resulting Type I error rate
and expected relative testing cost. This is a major drawback to the use
of the GS strategy as a technique for factor screening. More
importantly, this apparent lack of "robustness" severely limits the
practicality of the GS strategy. It remains to be seen, however, how the
RB and GS strategies compare in the framework where the active effects
are not necessarily assumed to be of the same absolute magnitude. We
consider this more general situation in Part II of this technical report.
8.  APPENDIX


Listed below are the optimal GS strategies as determined by the
computer-aided search routine for the experimental conditions described
in Section 4. The values of α_1 and α_2 are rounded to five decimal
places.


  K    k   Δ/σ   E_U   E(R)   GS.0                        GS.[k/2]

 60    4    2    10%    12    GS(30, 0.00747, 1.00000)    GS(30, 0.01811, 1.00000)
                        26    GS(7, 0.01097, 0.59142)     GS(7, 0.01447, 0.57278)
                        38    GS(5, 0.02248, 0.38328)     GS(4, 0.01200, 0.51311)
                        52    GS(3, 0.26978, 0.27305)     GS(2, 0.16950, 0.44325)

 60    4    2    20%    12    GS(30, 0.00747, 1.00000)    GS(30, 0.01811, 1.00000)
                        26    GS(7, 0.01097, 1.00000)     GS(7, 0.01447, 1.00000)
                        38    GS(5, 0.02248, 0.76656)     GS(4, 0.01200, 1.00000)
                        52    GS(3, 0.26978, 0.54610)     GS(2, 0.16950, 0.88649)

 60   16    2    10%    12    GS(30, 0.00188, 1.00000)    GS(30, 0.00927, 1.00000)
                        26    GS(12, 0.00078, 0.45991)    GS(12, 0.00432, 0.42828)
                        38    GS(7, 0.00325, 0.29673)     GS(12, 0.01265, 0.23161)
                        52    GS(7, 0.00930, 0.17649)     GS(12, 0.04630, 0.14983)

 60   16    2    20%    12    GS(30, 0.00188, 1.00000)    GS(30, 0.00927, 1.00000)
                        26    GS(12, 0.00078, 0.91982)    GS(12, 0.00432, 0.85657)
                        38    GS(7, 0.00325, 0.59346)     GS(12, 0.01265, 0.46323)
                        52    GS(7, 0.00930, 0.35298)     GS(12, 0.04630, 0.29848)

 60    4    4    10%    12    GS(30, 0.00375, 1.00000)    GS(30, 0.00947, 1.00000)
                        26    GS(7, 0.00289, 0.59421)     GS(7, 0.00387, 0.57601)
                        38    GS(5, 0.01153, 0.38570)     GS(4, 0.01021, 0.51404)
                        52    GS(2, 0.16698, 0.44752)     GS(2, 0.16911, 0.44358)

 60    4    4    20%    12    GS(30, 0.00375, 1.00000)    GS(30, 0.00947, 1.00000)
                        26    GS(7, 0.00289, 1.00000)     GS(7, 0.00387, 1.00000)
                        38    GS(5, 0.01153, 0.77139)     GS(4, 0.01021, 1.00000)
                        52    GS(2, 0.16698, 0.89504)     GS(2, 0.16911, 0.88716)

 60   16    4    10%    12    GS(30, 0.00094, 1.00000)    GS(30, 0.00470, 1.00000)
                        26    GS(12, 0.00019, 0.45997)    GS(12, 0.00109, 0.42858)
                        38    GS(7, 0.00082, 0.29697)     GS(12, 0.00324, 0.23166)
                        52    GS(7, 0.00235, 0.17660)     GS(12, 0.01238, 0.14986)

 60   16    4    20%    12    GS(30, 0.00094, 1.00000)    GS(30, 0.00470, 1.00000)
                        26    GS(12, 0.00019, 0.91993)    GS(12, 0.00109, 0.85716)
                        38    GS(7, 0.00082, 0.59394)     GS(12, 0.00324, 0.46332)
                        52    GS(7, 0.00235, 0.35319)     GS(12, 0.01238, 0.29971)

  K    k   Δ/σ   E_U   E(R)   GS.0                        GS.[k/2]

240   16    2    10%    46    GS(22, 0.00001, 0.98824)    GS(20, 0.00027, 0.94384)
                       100    GS(6, 0.00083, 0.53956)     GS(6, 0.00113, 0.52542)
                       144    GS(3, 0.05907, 0.55223)     GS(3, 0.06633, 0.53966)
                       198    GS(2, 0.18920, 0.41071)     GS(2, 0.19125, 0.40751)

240   16    2    20%    46    GS(22, 0.00001, 1.00000)    GS(20, 0.00027, 1.00000)
                       100    GS(6, 0.00083, 1.00000)     GS(6, 0.00113, 1.00000)
                       144    GS(3, 0.05907, 1.00000)     GS(3, 0.06633, 1.00000)
                       198    GS(2, 0.18920, 0.82143)     GS(2, 0.19125, 0.81502)

240   64    2    10%    46    GS(27, 0.00005, 0.82463)    GS(27, 0.00052, 0.79563)
                       100    GS(16, 0.00000, 0.35317)    GS(27, 0.00228, 0.28641)
                       144    GS(9, 0.00001, 0.25538)     GS(27, 0.00599, 0.18749)
                       198    GS(9, 0.00004, 0.16109)     GS(40, 0.10492, 0.12894)

240   64    2    20%    46    GS(27, 0.00005, 1.00000)    GS(27, 0.00052, 1.00000)
                       100    GS(16, 0.00000, 0.70634)    GS(27, 0.00228, 0.57282)
                       144    GS(9, 0.00001, 0.51075)     GS(27, 0.00599, 0.37497)
                       198    GS(9, 0.00004, 0.32217)     GS(40, 0.10492, 0.25788)

240   16    4    10%    46    GS(22, 0.00000, 0.98796)    GS(20, 0.00003, 0.94505)
                       100    GS(6, 0.00011, 0.53985)     GS(6, 0.00014, 0.52574)
                       144    GS(3, 0.05907, 0.55223)     GS(3, 0.06633, 0.53966)
                       198    GS(2, 0.18920, 0.41071)     GS(2, 0.19125, 0.40751)

240   16    4    20%    46    GS(22, 0.00000, 1.00000)    GS(20, 0.00003, 1.00000)
                       100    GS(6, 0.00011, 1.00000)     GS(6, 0.00014, 1.00000)
                       144    GS(3, 0.05907, 1.00000)     GS(3, 0.06633, 1.00000)
                       198    GS(2, 0.18920, 0.82143)     GS(2, 0.19125, 0.81502)

240   64    4    10%    46    GS(27, 0.00001, 0.82351)    GS(27, 0.00013, 0.79592)
                       100    GS(16, 0.00000, 0.35314)    GS(27, 0.00057, 0.28646)
                       144    GS(9, 0.00000, 0.25543)     GS(27, 0.00151, 0.18748)
                       198    GS(9, 0.00000, 0.16111)     GS(40, 0.05425, 0.12894)

240   64    4    20%    46    GS(27, 0.00001, 1.00000)    GS(27, 0.00013, 1.00000)
                       100    GS(16, 0.00000, 0.70629)    GS(27, 0.00057, 0.57292)
                       144    GS(9, 0.00000, 0.51086)     GS(27, 0.00015, 0.37496)
                       198    GS(9, 0.00000, 0.32222)     GS(40, 0.05425, 0.25789)